COMPUTER INPUT MICROFILM (CIM) FEASIBILITY STUDY


By J. B. Burford and J. M. Clark 1/


SUMMARY 

This feasibility study determined that Computer Input Microfilm 
(CIM) techniques can be used with a high degree of accuracy to 
convert hydrologic data recorded in Computer Output Microfilm (COM) 
to computer-readable magnetic tape. Many beneficial aspects that
will improve overall accuracy and reduce conversion costs were 
determined also. 

After the proposed changes have been incorporated into the COM 
procedures, and as funds and time permit, a confirmation CIM pro- 
duction run is planned. 

Computer Output Microfilm will be used as the backup medium for 
ARS hydrologic data-bank storage. Data format will be compatible 
with that required for CIM conversion to magnetic tape. 


INTRODUCTION 

The Agricultural Research Service’s Hydrologic Data Laboratory 
is responsible for developing and maintaining a bank of hydrologic 
data and related information obtained at the various ARS Watershed 
Research Centers. These headquarters are located at Athens, Ga.; 
Boise, Idaho; Burlington, Vt.; Chickasha and Stillwater, Okla.; 
Columbia, Mo.; Coshocton, Ohio; Oxford, Miss.; Temple, Tex.; 
Tucson, Ariz.; and University Park, Pa. 

Computer-readable magnetic tapes are used as the medium for
manipulating and storing the active volumes of data. The value 
and uniqueness of the data dictate that a dependable and positive 
backup system be used to insure against unforeseeable mishaps and 
disasters such as total loss of data. 2/ Since information stored
as magnetic charges in a metallic oxide coating of plastic film 
is quite vulnerable to such threats as stray magnetic fields, 
uncontrolled environment, and deterioration of the plastic film 
during extended periods, special facilities that have a controlled 
environment are required (fig. 1). 


1/ Hydraulic engineer and computer specialist, respectively,
Hydrologic Data Laboratory, Plant Physiology Institute, Northeastern
Region, Agricultural Research Service, U.S. Department of
Agriculture, Beltsville, Md. 20705.

2/ Panorama, Business Systems Market Div., Eastman Kodak Co.,
March 1973.
Figure 1.--Special facilities [...] hydraulic data bank recorded in
[...] tape.
DISCUSSION


Through cooperative efforts, CIM techniques that use a combination
of the cathode-ray tube and optical character recognition
principles were developed to read the COM backup copies and to
convert them to magnetic tape, if required. This is a brief report
of a CIM feasibility study that was recently completed by the
Hydrologic Data Laboratory.

The Laboratory obtained COM copies of sample hydrologic data 
stored on magnetic tape to use in the study. The COM copies were 
obtained with a service-bureau-operated FR-80 COM recorder 
(fig. 4). The COM data were in the standard, human-readable, 
hard-copy printout format, reduced (24X) as required to fit on 
16-mm unsprocketed microfilm. A review of the sample hydrologic
COM data indicated that the image-processing system that had been 
developed by Information International, Inc., in the GRAFIX I, 
was capable of converting the microfilm images back to the 
computer-readable, magnetic-tape form.
A contract was negotiated to convert about 265,000 characters
from COM to a magnetic-tape (CIM) version, with an accuracy goal
of 99.5 percent, and to furnish the Hydrologic Data Laboratory 
with a copy of the generated tape for a character-by-character 
comparison against the original tape. 

The GRAFIX I image-processing system is designed to substitute 
(or to reject and flag) character images that do not compare 
with benchmark characters within specified limits. The degree 
of substitution or rejection can be influenced by the limits set 
in the system. 
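
The effect of such limits can be illustrated with a minimal sketch. It is not the GRAFIX I algorithm, whose internals are not described in this report. It assumes, purely for illustration, that each character image is a small binary bitmap, that benchmark characters are stored as bitmaps of the same size, and that the comparison is a count of differing cells. The names `BENCHMARKS`, `distance`, `classify`, and `limit` are hypothetical.

```python
# Hypothetical sketch: template matching with a reject limit.
# Bitmaps are tuples of strings; '#' is a dark cell, '.' a light cell.

REJECT = "?"  # flag written in place of a character that cannot be trusted

BENCHMARKS = {
    "1": ("..#..", ".##..", "..#..", "..#..", ".###."),
    "7": ("#####", "....#", "...#.", "..#..", "..#.."),
    ".": (".....", ".....", ".....", ".....", "..#.."),
}


def distance(a, b):
    """Count the cells in which two equal-sized bitmaps differ."""
    return sum(ca != cb for ra, rb in zip(a, b) for ca, cb in zip(ra, rb))


def classify(image, limit):
    """Return the closest benchmark character, or REJECT when even the
    closest benchmark differs from the image in more than `limit` cells."""
    best_char, best_dist = min(
        ((ch, distance(image, bm)) for ch, bm in BENCHMARKS.items()),
        key=lambda pair: pair[1],
    )
    return best_char if best_dist <= limit else REJECT


# A degraded "1": two cells differ from the benchmark bitmap.
smudged = ("..#..", ".##..", "..#..", "..##.", ".##..")

print(classify(smudged, limit=3))  # loose limit: the image is accepted as "1"
print(classify(smudged, limit=1))  # tight limit: the image is rejected and flagged
```

A loose limit accepts more doubtful images at the risk of a wrong substitution; a tight limit rejects and flags more of them for later review.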

The CIM-generated tape contained 2,999 records with 99 characters
per record, for a total of 296,901 characters. The
character-by-character comparison, using computer logic, determined
that 63 characters had been rejected or misinterpreted.
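
A comparison of this kind can be sketched as follows. This is an illustration only, not the program the Laboratory used. It assumes both tapes have already been read into lists of fixed-length record strings; the function name `compare_tapes` and the sample records are made up.

```python
def compare_tapes(original, converted):
    """Compare two lists of equal-length records character by character.

    Returns (bad_chars, bad_records): the number of character positions
    that differ, and the number of records with at least one difference.
    """
    if len(original) != len(converted):
        raise ValueError("tapes hold different numbers of records")
    bad_chars = 0
    bad_records = 0
    for rec_orig, rec_conv in zip(original, converted):
        if len(rec_orig) != len(rec_conv):
            raise ValueError("records differ in length")
        diffs = sum(a != b for a, b in zip(rec_orig, rec_conv))
        bad_chars += diffs
        bad_records += 1 if diffs else 0
    return bad_chars, bad_records


# Made-up records for illustration; "?" stands for a rejected character.
original = ["0012 0345 0.70", "0013 0000 1.25", "0014 0102 0.00"]
converted = ["0012 0345 0.70", "0013 0000 1?25", "0014 0702 0,00"]

print(compare_tapes(original, converted))
```

Counting wrong characters and wrong records separately matters because several errors can fall in the same record.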

This represents an average of 1 rejected or wrong character for
each 4,713 characters read, for an accuracy of 99.98 percent.
This margin of error is well within the accuracy goal. The 63 
rejected or wrong characters were distributed among 53 data 
records, or 1 incorrect record for each 56.5 records converted, 
for an accuracy of 98.22 percent. 
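
These figures can be recomputed directly from the reported counts (2,999 records, 99 characters per record, 63 wrong characters, 53 affected records). The short script below is only an arithmetic check; the variable names are illustrative. The quotient 296,901 / 63 is about 4,713, and the last digit of the other results may differ from the printed figures depending on whether values are rounded or truncated.

```python
records = 2999          # records on the CIM-generated tape
chars_per_record = 99
bad_chars = 63          # rejected or misinterpreted characters
bad_records = 53        # records containing at least one such character

total_chars = records * chars_per_record
chars_per_error = total_chars / bad_chars
char_accuracy = 100 * (1 - bad_chars / total_chars)
records_per_error = records / bad_records
record_accuracy = 100 * (1 - bad_records / records)

print(f"{total_chars:,} characters converted")
print(f"1 wrong character per {chars_per_error:,.0f} characters "
      f"({char_accuracy:.2f} percent accurate)")
print(f"1 wrong record per {records_per_error:.1f} records "
      f"({record_accuracy:.2f} percent accurate)")
```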


CONCLUSIONS 

Many beneficial facts were realized or confirmed from the 
study. These facts were: 

1. Hydrologic data recorded in computer output microfilm 
can be converted to magnetic tape with a high degree of 
accuracy. This conversion was accomplished at a cost of 
$0.75 per 1,000 characters, excluding the cost required 
to modify and develop the software. 

2. Difficulties are encountered in keeping the image 
recognition system oriented when two or more adjacent 
blank spaces occur in a record. Many of the 63 errors 
resulted from efforts to handle multiple blank-space 
situations. 2/

3. Several errors that occurred resulted from difficulties 
encountered in recognizing the decimal (.) character. 3/


2/ A COM data format designed to eliminate multiple blank
spaces, by filling blank data fields with zeros and by allowing
only one blank column between data fields, should eliminate most
conversion errors.

3/ A physically larger decimal font used in creating the COM 
would help to eliminate the problem of decimal (.) character
recognition.
4. [The cost] of CIM-generated, magnetic-tape data is related
to the quantity of characters converted. The format should
be designed so that only the data required for regeneration
are converted.
5. Raster marks included in each frame of COM data are very
helpful in the orientation of the image recognition system.
Forms overlay techniques available in the COM
equipment can be used to add these fiducials.


6. A system for checking the integrity of the CIM-generated
data should be built into the COM version. This system
would entail an algorithm to review the several digits
in each record and then to compute a [...]
record it at the end of the record, [...]
be checked with the same algorithm a[...]
version.
7. The nature of hydrologic data is such that questionable
character images should be rejected and flagged, as 
opposed to substituted. 
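
The format and integrity recommendations in this list can be illustrated together. The sketch below is an example built on assumptions, not the Laboratory's format: the field width, the weighted modulo-10 digit sum, and the names `check_digit`, `format_record`, and `verify_record` are all hypothetical. It zero-fills every field, separates fields by exactly one blank column, and appends a check value computed from the digits in the record, so that the same computation can be repeated on the converted tape.

```python
def check_digit(text):
    """Weighted sum of the digits in `text`, modulo 10 (illustrative scheme).

    The weights cycle through 1, 3, 7. Each is coprime to 10, so replacing
    any single digit by a different digit changes the result.
    Non-digit characters are skipped.
    """
    weights = (1, 3, 7)
    digits = [int(c) for c in text if c.isdigit()]
    return sum(d * weights[i % 3] for i, d in enumerate(digits)) % 10


def format_record(values, width=5):
    """Zero-fill each non-negative integer field to `width` columns, join the
    fields with a single blank, and append the check digit as a last field."""
    body = " ".join(f"{v:0{width}d}" for v in values)
    return f"{body} {check_digit(body)}"


def verify_record(record):
    """Recompute the check digit of a record and compare it with the stored one."""
    body, _, stored = record.rpartition(" ")
    return stored.isdigit() and int(stored) == check_digit(body)


record = format_record([12, 0, 345])
print(record)                                        # zero-filled, single blanks
print(verify_record(record))                         # intact record
print(verify_record(record.replace("345", "845")))   # one misread digit
```

A single check digit cannot catch every combination of errors; a longer check value would catch more, at the cost of converting more characters per record.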